Competitive Classification and Closeness Testing

نویسندگان

  • Jayadev Acharya
  • Hirakendu Das
  • Ashkan Jafarpour
  • Alon Orlitsky
  • Shengjun Pan
  • Ananda Theertha Suresh
چکیده

We study the problems of classification and closeness testing. A classifier associates a test sequence with the one of two training sequences that was generated by the same distribution. A closeness test determines whether two sequences were generated by the same or by different distributions. For both problems all natural algorithms are symmetric—they make the same decision under all symbol relabelings. With no assumptions on the distributions’ support size or relative distance, we construct a classifier and closeness test that require at most Õ(n) samples to attain the n-sample accuracy of the best symmetric classifier or closeness test designed with knowledge of the underlying distributions. Both algorithms run in time linear in the number of samples. Conversely we also show that for any classifier or closeness test, there are distributions that require Ω̃(n) samples to achieve the n-sample accuracy of the best symmetric algorithm that knows the underlying distributions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Pitman-Closeness of Preliminary Test and Some Classical Estimators Based on Records from Two-Parameter Exponential Distribution

In this paper, we study the performance of estimators of parametersof two-parameter exponential distribution based on upper records. The generalized likelihood ratio (GLR) test was used to generate preliminary test estimator (PTE) for both parameters. We have compared the proposed estimator with maximum likelihood (ML) and unbiased estimators (UE) under mean-squared error (MSE) and Pitman me...

متن کامل

Competitive Closeness Testing

We test whether two sequences are generated by the same distribution or by two different ones. Unlike previous work, we make no assumptions on the distributions’ support size. Additionally, we compare our performance to that of the best possible test. We describe an efficiently-computable algorithm based on pattern maximum likelihood that is near optimal whenever the best possible error probabi...

متن کامل

Negative Selection Based Data Classification with Flexible Boundaries

One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...

متن کامل

Optimizing Cost Function in Imperialist Competitive Algorithm for Path Coverage Problem in Software Testing

Search-based optimization methods have been used for software engineering activities such as software testing. In the field of software testing, search-based test data generation refers to application of meta-heuristic optimization methods to generate test data that cover the code space of a program. Automatic test data generation that can cover all the paths of software is known as a major cha...

متن کامل

An Improvement in Support Vector Machines Algorithm with Imperialism Competitive Algorithm for Text Documents Classification

Due to the exponential growth of electronic texts, their organization and management requires a tool to provide information and data in search of users in the shortest possible time. Thus, classification methods have become very important in recent years. In natural language processing and especially text processing, one of the most basic tasks is automatic text classification. Moreover, text ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012